Normal Probability

for calculus students

A normal integral

A strong understanding of data has become highly valuable in today’s world, and data is analyzed using statistics. Statistical models of data are built on top of so-called distributions, and computations involving continuous distributions often involve computing the area under a curve. Not surprisingly, then, the theoretical and computational foundations of statistics lie in calculus.

In this document, we’ll take a look at what a distribution really is and how the so-called normal distribution, used so often in elementary statistics, arises in practice. We’ll also see how computations performed by rote in an introductory statistics class really arise from basic integration. Using these techniques, we’ll be able to answer questions arising in games of chance, like the coin-flipping and dice questions worked out in the examples toward the end of this document.

And we’ll answer those questions in the same way that we approach data-based questions.

Continuous and discrete distributions

The function shown in figure 1 is an example of a continuous distribution. To understand this and how it relates to probabilistic computations, we should first examine a few simpler distributions.

Uniform distributions

Suppose we pick a real number randomly from the interval \([0,1]\). What does that even mean? What is the probability we pick \(1\) or \(0.1234\) or \(1/\pi\)? What is the probability that our pick lies in the left half of the interval? One way to make sense of this is to suppose the probability that our pick lies in any particular interval is proportional to the length of that interval. This might make sense if, for example, we choose the number by throwing a dart at a number line while blindfolded. Then, the answer to our second question should be \(1/2.\) The probability that our pick lies in the interval \([0,0.3]\) should be \(3/10.\)

More generally, we can express such a probability via integration against a probability density function. A probability density function is simply a non-negative function whose total integral is 1; i.e.

\[\int_{-\infty }^{\infty } f(x) \, dx=1.\]

In our example involving \([0,1]\), our probability density function would be

\[f(x)=\left\{ \begin{array}{cc} 1 & 0\leq x\leq 1 \\ 0 & \text{else}. \end{array} \right.\]

Then, the probability that a point chosen from \([0,1]\) lies in the left half of the interval is

\[\int_0^{1/2} 1 \, dx=\frac{1}{2}.\]

The probability that we pick a number from the interval \([0,0.3]\) is the area of the darker, rectangular region shown in figure 2.

The uniform distribution on \([0,1]\)

In some sense, this is a natural generalization of a discrete problem: Pick an integer between 1 and 10 uniformly at random. In that case, it makes sense to suppose that each number has an equal probability \(1/10\) of being chosen. The probability of choosing a \(1\), \(2\), or \(3\) would be \(1/10+1/10+1/10\) or \(3/10\); this is called a uniform discrete distribution. The sub-rectangles indicated by the dashed lines in figure 2 are meant to emphasize the relationship, since they all have area \(1/10\). A discrete visualization of this is shown in the top of figure 3. The bottom of figure 3 illustrates the uniform discrete distribution on the numbers \(\{1,2,\ldots ,100\}\). Note how the continuous uniform distribution on \([0,1]\) shown in figure 2 appears to be a limit of these discrete distributions, after rescaling.

Uniform discrete distributions

Now suppose we pick an integer between \(1\) and \(1000\), all with equal probability \(1/1000\). Then the probability of generating a number between \(1\) and \(314\) would be

\[\sum _{k=1}^{314} \frac{1}{1000}=\frac{314}{1000}=\int _0^{0.314}1\,dx.\]

I’ve included the integral here to emphasize the relationship with the continuous distribution. In a real sense, the continuous, uniform distribution on \([0,1]\) is a limit of discrete distributions.
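Here’s a quick check of this correspondence in Python, the language used for the code in this document. SciPy’s quad function, which we’ll meet again later, estimates definite integrals numerically:

```python
from scipy.integrate import quad

# Discrete: 314 equally likely outcomes, each with probability 1/1000
discrete = sum(1 / 1000 for _ in range(314))

# Continuous: integrate the constant density 1 over [0, 0.314]
continuous, err = quad(lambda x: 1, 0, 0.314)

print(discrete, continuous)  # both print 0.314 (up to rounding)
```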

A bell-shaped distribution

Next, we’ll generate a bell-shaped distribution. To do so, we generate an integer between \(0\) and \(10\) by flipping a coin 10 times and counting the number of heads. There are 11 possible outcomes, but they are not all equally likely. The probability of generating a zero is \(1/2^{10}=1/1024\), which is much smaller than \(1/11\). This is because we must throw a tail on each throw and the throws are independent of one another. Since the probability of getting a tail on a single throw is \(1/2\), the probability of getting 10 straight tails is \(1/2^{10}\). The probability of generating a 1 is \(10/2^{10}\), since the single head could occur on any of 10 possible throws; this probability is ten times bigger than the probability of a zero, yet still much smaller than \(1/11\).

In a discrete math class or introductory statistics class, we would talk carefully about the binomial coefficients:

\[\left( \begin{array}{c} n \\ k \end{array} \right)=\frac{n!}{k!(n-k)!}.\]

This is read \(n\) choose \(k\) and represents the number of ways to choose \(k\) objects from \(n\) given objects. Thus, if we flip a coin \(n\) times and want exactly \(k\) heads, there are \(n\) choose \(k\) possible ways to be successful. If, for example, we flip the coin five times and want exactly two heads, there are

\[\left( \begin{array}{c} 5 \\ 2 \end{array} \right)=\frac{5!}{2!(5-2)!}=10\]

ways to make this happen. These are all illustrated in figure 4. Note that each particular sequence of heads and tails has equal probability \(1/2^5\) of occurring. Thus, the probability of getting exactly 2 heads in five flips is \(10/32 = 0.3125\), as the quick check following the figure confirms.

Ways to get two heads in five flips
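We can verify this count and the resulting probability directly in Python; math.comb computes binomial coefficients:

```python
from math import comb

print(comb(5, 2))         # 10 ways to place 2 heads among 5 flips
print(comb(5, 2) / 2**5)  # 0.3125, since each of the 2**5 sequences has probability 1/32
```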

More generally, the probability of getting exactly \(k\) heads in \(n\) flips is

\[\left( \begin{array}{c} n \\ k \end{array} \right)\frac{1}{2^n}.\]

We can plot these numbers in a manner that is analogous to the uniform discrete distributions shown in figure 3; the result is shown in figure 5. Note that each discrete plot is accompanied by a continuous curve that approximates the points very closely. There is a particular formula for this curve that defines a continuous distribution, called a normal distribution. This continuous distribution is, in a natural sense, the limit of the discrete distributions when properly scaled. A basic understanding of the normal distribution is our primary objective here. We’ve got a bit more notation we’ll have to slog through first, however.

Binomial distributions together with their normal approximations.

Formalities

Let’s try to write down some careful definitions for all this. The outcome of a random experiment (tossing a coin, throwing a dart at a number line, etc.) will be denoted by \(X\). Probabilists would call \(X\) a random variable. We can feel that we thoroughly understand \(X\) if we know its distribution. The two broad classes of distributions we’ve seen to this point are discrete and continuous, leading to discrete and continuous random variables.

Discrete random variables

A discrete random variable \(X\) takes values in a discrete set, like \(\{0,1,2,\ldots ,n\}\), and a discrete distribution is simply a list of non-negative probabilities \(\left\{p_0,p_1,p_2,\ldots ,p_n\right\}\), associated with these values, that add up to one. The uniform discrete distribution, for example, takes all these probabilities to be the same. The binomial distribution weights the middle terms much more heavily. In either case, the probability that \(X\) takes on some particular value \(i\) is simply \(p_i\). To compute the probability that \(X\) takes on one of a set \(S\) of values, we simply sum the corresponding \(p_i\)s, i.e. we compute

\[\sum _{i\in S} p_i.\]

Continuous random variables

A continuous random variable \(X\) takes its values in an interval or even the whole real line \(\mathbb{R}\). The distribution of \(X\) is a non-negative function \(f(x)\). To compute the probability that \(X\) lies in some interval \([a,b]\), we compute the integral

\[\int _a^b f(x)\,dx.\]

Of course, a real-valued random variable must take on some value. That is, the probability of choosing some number must be one. Thus, we require that \[\int _{-\infty}^{\infty} f(x)\,dx = 1.\]

Measures of distributions

There are two very general and important descriptive properties defined for distributions, namely the mean \(\mu\) and standard deviation \(\sigma\). We must understand these to understand how the normal distributions are related to the binomial distributions.

Mean and standard deviation for discrete random variables

As we’ve just described, if \(X\) is a random variable taking on values \(\{0,1,\ldots ,n\}\), its distribution is simply the list \(\left\{p_0,p_1,\ldots ,p_n\right\}\) where \(p_k\) indicates the probability that \(X=k\). The mean \(\mu\) of a distribution simply represents the weighted average of its possible values. We express this concretely as

\[\mu =\sum _k k \, p_k.\]

For example, if we choose a number from \(\{0,1,2,3,4\}\) uniformly (so each value has probability \(p=1/5\)), then the mean is

\[\mu =\frac{(0+1+2+3+4)}{5}=2,\]

exactly as we’d expect. The mean of the binomial distribution is also “near the middle” but distributions can certainly be weighted otherwise.

The binomial distribution is particularly useful for us, since we ultimately want to understand the normal distribution. Recall that a binomially distributed random variable is constructed by flipping a coin \(n\) times and counting the number of heads. If we flip a coin once, we generate either a zero or a one with probability \(1/2\) each. Thus, the mean of one coin flip is \(1/2\). If we add random variables, then their means add. Thus, the mean of the binomial distribution with \(n\) flips is \(n/2\). This reflects the fact that we expect to get a head about half the time. In fact, the mean is often referred to as the expectation. In this context, the expectation of the random variable \(X\) is written as \(E(X)\).

Standard deviation \(\sigma\), and its square the variance \(\sigma ^2\), both measure the dispersion of the data; the larger the value of \(\sigma\), the more spread out the data. They’re quite similar conceptually, but sometimes one is easier to work with than the other. The variance of a random variable with mean \(\mu\) is defined by

\[\sigma ^2=\sum _k (k-\mu )^2 p_k.\]

Note that the expression \(k-\mu\) is the (signed) difference between the particular value and the average value. We want to measure how large this difference is on average, so we take the weighted average. It makes sense to square first, since we don’t want the signs to cancel.

The variance of our coin flip example is

\[\sigma ^2=\left(0-\frac{1}{2}\right)^2\frac{1}{2}+\left(1-\frac{1}{2}\right)^2\frac{1}{2}=\frac{1}{4}.\]

It follows that the standard deviation is \(\sigma =1/2\). If we add independent random variables, then their variances add. Thus, the variance of the binomial distribution with \(n\) flips is \(n/4\) and its standard deviation is \(\sqrt{n}/2\).
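These formulas are easy to check against the binomial probabilities directly; here’s a short sketch for \(n=10\):

```python
from math import comb

n = 10
# Binomial probabilities: P(k heads in n flips) = C(n, k) / 2**n
p = [comb(n, k) / 2**n for k in range(n + 1)]

mu = sum(k * pk for k, pk in enumerate(p))
var = sum((k - mu)**2 * pk for k, pk in enumerate(p))

print(mu, var)  # 5.0 and 2.5, matching n/2 and n/4
```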

Mean and standard deviation for continuous random variables

The mean, standard deviation, and variance of continuous probability distributions can be defined in a way that is analogous to discrete distributions. The main difference is that we use integration, rather than summation, to define them. Of course, the definite integral is defined as a limit of sums, and definite integration is often thought of as a continuous analog of summation; perhaps that perspective will help all of this make sense.

Specifically, the mean \(\mu\) and variance \(\sigma^2\) are defined by

\[\mu = \int _{-\infty }^{\infty }x p(x)dx\]

and

\[\sigma ^2 = \int _{-\infty }^{\infty }(x-\mu )^2p(x)dx.\]

As with discrete distributions, the standard deviation is the square root of the variance.

Suppose, for example, that \(X\) is uniformly distributed on the interval \([a,b]\). Thus, \(X\) has distribution

\[p(x)=\left\{ \begin{array}{cc} \frac{1}{b-a} & a\leq x\leq b \\ 0 & \text{else}. \end{array} \right.\]

Then, we can compute the mean as follows:

\[\left.\frac{1}{b-a}\int_a^b x \, dx=\frac{1}{b-a}\frac{1}{2}x^2\right|_a^b=\frac{1}{2(b-a)}\left(b^2-a^2\right)=\frac{a+b}{2}.\]

This is, of course, exactly what we’d expect. In your homework, you’ll show that \(\sigma ^2=(b-a)^2/12\). Note that the larger the interval, the larger the variance.
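Here’s a numerical check of both formulas for one arbitrary choice of \(a\) and \(b\) (no substitute for the homework derivation, of course):

```python
from scipy.integrate import quad

a, b = 2.0, 5.0
p = lambda x: 1 / (b - a)  # the uniform density on [a, b]

mu, _ = quad(lambda x: x * p(x), a, b)
var, _ = quad(lambda x: (x - mu)**2 * p(x), a, b)

print(mu, (a + b) / 2)       # both 3.5
print(var, (b - a)**2 / 12)  # both 0.75
```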

The normal distribution

The most widely used distribution in all of elementary statistics is certainly the normal distribution. In this section, we’ll take a look at its true, mathematical definition as a formula involving the exponential and we’ll see how to deal with integrals involving that formula.

Definition

The formula for the normal distribution with mean \(\mu\) and standard deviation \(\sigma\) is

\[\label{eq:normalDistribution} p(x)=\frac{1}{\sqrt{2 \pi } \sigma }e^{-(x-\mu )^2/\left(2\sigma ^2\right)}.\]

The graphs of several normal distributions are shown in figure 8.

When \(\mu =0\) and \(\sigma =1\) we get the standard normal. Thus, the probability distribution of the standard normal is

\[p(x)=\frac{1}{\sqrt{2 \pi }}e^{-x^2/2}.\]

The standard normal is symmetric about the vertical axis in figure 8.

Several normal distributions
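Since a normal distribution is a probability density, its total integral must be one; we can check this numerically for the standard normal:

```python
import numpy as np
from scipy.integrate import quad

# The standard normal density
def p(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

total, err = quad(p, -np.inf, np.inf)
print(total)  # 1.0, up to numerical error
```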

Interpretation as probability

Let’s make sure we understand how this crazy-looking function \(e^{-(x-\mu)^2/(2\sigma^2)}/(\sqrt{2\pi}\sigma)\) is related to probability. Suppose that we have a random variable \(X\) that is normally distributed with mean \(\mu = 70\) and standard deviation \(\sigma = 10\). Suppose we’re curious to know the probability that a value generated by \(X\) lies in the interval between \(65\) and \(80\). In symbols, we want to know \(P(65<X<80)\).

The point behind the normal distribution function is that we can express \(P(65<X<80)\) as \[ P(65<X<80) = \frac{1}{\sqrt{2\pi}\,10} \int_{65}^{80} e^{-(x-70)^2/(2\times10^2)}\,dx. \]

Now, how you might actually compute such an integral is another matter!

Relating normal distributions

Any normal distribution is related to the standard normal distribution because changing \(\mu\) or \(\sigma\) changes the graph of a normal distribution in predictable ways. A change of \(\mu\) simply shifts the graph to the left or right; this changes the mean of the distribution, which is located where the maximum occurs. Reducing the size of \(\sigma\) increases the maximum value and concentrates the graph about that maximum value.

A major challenge in dealing with the normal distribution is that its density has no elementary antiderivative! Elementary statistics courses get around this by providing a table of numerically computed values of

\[ \frac{1}{\sqrt{2 \pi}}\int_0^b e^{-x^2/2}\,dx \] for various values of \(b\).

From that information, one can immediately compute all sorts of integrals involving the standard normal. For example,

\[ \frac{1}{\sqrt{2 \pi }}\int_{-1}^2 e^{-x^2/2} \, dx = \frac{1}{\sqrt{2 \pi}}\int_0^1 e^{-x^2/2} \, dx +\frac{1}{\sqrt{2 \pi }}\int_0^2 e^{-x^2/2}\,dx \]

and both of the integrals on the right can be computed from the table; here, we used the symmetry of \(e^{-x^2/2}\) about zero to replace the integral over \([-1,0]\) with an equal integral over \([0,1]\). Furthermore, integrals involving any normal distribution can be computed in terms of the standard normal. While the trick is described in an elementary statistics class, it ultimately boils down to the following formula:

\[\frac{1}{\sqrt{2\pi }\sigma }\int_a^b e^{-\frac{(x-\mu )^2}{2\sigma ^2}} \, dx=\frac{1}{\sqrt{2\pi }}\int _{(a-\mu )/\sigma }^{(b-\mu )/\sigma }e^{-x^2/2}\,dx.\]

Let’s use \(u\)-substitution to verify this. Let \(u=(x-\mu )/\sigma\) so that \(du = \frac{1}{\sigma} dx\). Then,

\[\begin{aligned} \frac{1}{\sqrt{2\pi }\sigma }\int_a^b e^{-\frac{(x-\mu )^2}{2\sigma ^2}} \, dx &= \frac{1}{\sqrt{2\pi }}\int_a^b e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \, \frac{1}{\sigma}dx \\ &= \frac{1}{\sqrt{2\pi }}\int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} e^{-u^2/2}\,du. \end{aligned}\]
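Here’s a numerical spot-check of this identity for one arbitrary choice of \(\mu\), \(\sigma\), \(a\), and \(b\):

```python
import numpy as np
from scipy.integrate import quad

mu, sigma, a, b = 70.0, 10.0, 65.0, 80.0

# Left side: the normal integral with mean mu and standard deviation sigma
lhs, _ = quad(lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma), a, b)

# Right side: the standard normal integral with transformed bounds
rhs, _ = quad(lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi), (a - mu) / sigma, (b - mu) / sigma)

print(lhs, rhs)  # both approximately 0.5328
```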

We’ll see several explicit examples illustrating how to use this in the context of a problem as we move through the document.

Obtaining numerical estimates

So, just how do we compute an integral like \[ \frac{1}{\sqrt{2\pi}10} \int_{65}^{80} e^{-(x-70)^2/(2\times10^2)}\,dx \] to, thereby, evaluate \(P(65<X<80)\)?

First, recall that there’s no elementary formula for an antiderivative of the normal density. So, ultimately, the computation has to boil down to numerical approximation. That is, our answer won’t be an exact expression like \(1/\sqrt{\pi}\), though it could certainly look like \(0.56419\).

There are two fundamental approaches to this problem:

  • Use a numerical integrator or
  • Translate the integral to the integral of the standard normal and, then, look up the result in a table of standard normal integrals.

For calculus students, it’s worth understanding both techniques and, in particular, how the two methods are related.

Numerical integration

Numerical integration, quite generally, is the art of obtaining numerical estimates for definite integrals - a topic of tremendous importance in its own right. It might seem that we are “just” asking the computer for the numerical value. To be clear, though, there’s plenty of mathematical theory lurking in the background. Details behind these kinds of algorithms are discussed in most introductory courses in numerical analysis.

Here’s some code in the Python programming language that uses the numerical libraries NumPy and SciPy to obtain a good numerical estimate to our integral of interest:
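```python
import numpy as np
from scipy.integrate import quad

# The normal density with mean 70 and standard deviation 10
def p(x):
    return np.exp(-(x - 70)**2 / (2 * 10**2)) / (np.sqrt(2 * np.pi) * 10)

print(quad(p, 65, 80))  # prints a (val, err) pair; val is approximately 0.5328
```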

Note that the result is a (val, err) pair, where val is the value of the estimate and err is an estimate of the error, i.e. an indication of how far off val might be from the actual value. The SciPy quad function obtains its name from quadrature, the ancient art of finding a square with the same area as a given plane figure.

Using a normal table

The other alternative is to use \(u\)-substitution to translate your integral to a standard normal and then look up the value of that standard normal in a normal table. This is the technique that’s typically taught in an introductory statistics class. Ultimately, though, the standard normal table is computed using a numerical integrator and understanding the relationship between the methods is really only attainable with a good knowledge of calculus.

The appendix of this document contains a table of integrals for the standard normal; probabilities arising from all types of normal distributions can be computed using the techniques described here. Ultimately, though, the values in the table are all computed numerically. Thus, with a solid understanding of calculus, it probably makes sense to simply use a numerical integrator in the first place.

Let’s illustrate how we could compute the previous example \[ \frac{1}{\sqrt{2\pi}10} \int_{65}^{80} e^{-(x-70)^2/(2\times10^2)}\,dx \] using a normal table.

  • Step 1: Convert the integral to a standard normal integral with the substitution \(u=(x-\mu)/\sigma\). In this particular case, \(\mu=70\) and \(\sigma=10\). Thus, our integral becomes \[\begin{aligned} \frac{1}{\sqrt{2\pi}\,10}\int_{65}^{80} e^{-(x-70)^2/(2\times10^2)} \, dx &= \frac{1}{\sqrt{2\pi}}\int_{65}^{80} e^{-\frac{1}{2}\left(\frac{x-70}{10}\right)^2} \, \frac{1}{10}dx \\ &= \frac{1}{\sqrt{2\pi}}\int_{(65-70)/10}^{(80-70)/10} e^{-u^2/2} \, du \\ &= \frac{1}{\sqrt{2\pi}}\int_{-1/2}^{1} e^{-u^2/2} \, du = P(-1/2 < Z < 1). \end{aligned}\]

  • Step 2: Look up the endpoint values in a standard normal table (like the one in our appendix) and subtract. In this particular case, we find that
    \(P(Z < -0.5) \approx 0.3085\) and \(P(Z < 1) \approx 0.8413\). Thus, \[\begin{aligned} \frac{1}{\sqrt{2\pi}10} \int_{65}^{80} e^{-(x-70)^2/(2\times10^2)}\,dx &= P(65 < X < 80) \\ &= P(-1/2 < Z < 1) = P(Z < 1) - P(Z < -0.5) \\ &\approx 0.8413 - 0.3085 = 0.5328.\end{aligned}\]
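As a cross-check, SciPy also exposes the standard normal cumulative distribution function via scipy.stats.norm, which reproduces the table lookup:

```python
from scipy.stats import norm

# P(-1/2 < Z < 1) for a standard normal Z
print(norm.cdf(1) - norm.cdf(-0.5))  # approximately 0.5328
```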

The central limit theorem

There are two big theorems in probability theory - the law of large numbers and the central limit theorem; it is the second of these that explains the importance of the normal distribution. Both deal with a sequence of independent random variables \(X_1,X_2,\ldots\) that all have the same distribution. The law of large numbers simply states that, if each \(X_i\) has mean \(\mu\), then

\[\bar{X}_n=\frac{X_1+X_2+\cdots +X_n}{n}\]

is almost certainly close to \(\mu\). That is, flip a coin a bunch of times and it will come up heads around half the time.

The central limit theorem states more precise information about the distribution of \(\bar{X}_n\). Technically, the central limit theorem states that if each \(X_i\) has mean \(\mu\) and standard deviation \(\sigma\), then the random variable \(\sqrt{n}\left(\bar{X}_n-\mu \right)\) converges in distribution to the normal distribution with mean \(0\) and standard deviation \(\sigma\). In practice, this means that we can approximate \(S_n=X_1+X_2+\cdots +X_n\) using a normal distribution. Now, the mean of \(S_n\) will be \(n \mu\) and its standard deviation will be \(\sqrt{n}\sigma\). Thus, we must approximate using the normal distribution with this same mean and standard deviation; that is, with density

\[\label{eq:centralLimitNormalIntegral} p(x)=\frac{1}{\sqrt{2n \pi }\sigma }e^{-(x-n \mu )^2/\left(2n \sigma ^2\right)}.\]

It is important to understand that the particular distributions of the \(X_i\)s play no role here beyond their mean and standard deviation; all that is important is that they be independent and identically distributed. Thus, no matter what the distribution of the original \(X_i\)s, their sum (and hence their average) will be approximately normal!

Examples

Here are a few more examples illustrating the types of computations described in this document. The integrals must be worked out numerically and the text includes Python code (using NumPy/SciPy) to accomplish this, though there are plenty of good numerical integrators.

Note that NumPy represents mathematical \(\infty\) as inf. This is particularly convenient when computing normal integrals because we often have \(\infty\) or \(-\infty\) as a bound of integration. If you’re using a numerical integrator that doesn’t recognize \(\pm\infty\), you should be safe replacing an infinite bound with a point four or more standard deviations past the mean.

Coin flipping

Suppose we flip a coin 99 times. What is the probability that we get fewer than 47 heads?

Solution: As we’ve seen, the mean and standard deviation of a single coin flip are both \(1/2\). By the central limit theorem, the sum of \(n\) coin flips is approximately normally distributed with mean and standard deviation \(n/2\) and \(\left.\sqrt{n}\right/2\) respectively. Taking \(n=99\), we find that we should evaluate the following integral.

\[\int _{-\infty }^{46.5}\frac{2}{\sqrt{2\cdot 99\,\pi }}\,e^{-(x-99/2)^2/(2\cdot 99/4)}\,dx\]

The upper bound of \(46.5\), rather than \(47\), arises as an adjustment to relate the discrete and continuous distributions. This integral must be evaluated numerically; we can do so with Python as follows:
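```python
import numpy as np
from scipy.integrate import quad

n = 99
mu = n / 2               # mean of the number of heads
sigma = np.sqrt(n) / 2   # standard deviation of the number of heads

def p(x):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

val, err = quad(p, -np.inf, 46.5)
print(val)  # approximately 0.2732
```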

This particular example can also be done using the binomial distribution. In fact, the answer is exactly \[ \frac{1353597022728323255915530247}{4951760157141521099596496896} \]
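For the record, here’s how the exact value can be computed; Python’s integers and fractions handle the enormous numbers exactly:

```python
from fractions import Fraction
from math import comb

# P(fewer than 47 heads in 99 flips), computed exactly
exact = Fraction(sum(comb(99, k) for k in range(47)), 2**99)
print(exact)         # the fraction above (Fraction reduces it automatically)
print(float(exact))  # approximately 0.2733
```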

The normal integral is an approximation, but it is a very good one. The difference between the previous two computations is about \(0.000109944\).

The real power of using the normal distribution arises when we have a very large number of trials - as might happen in a problem in statistical mechanics. For example, what’s the probability of getting fewer than \(500001000\) heads in \(1000000000\) tosses? The binomial approach has half a billion complicated terms in the sum, which might not be practical to compute. The normal integration approach is no harder than it was in the previous example, though. We still need to compute the integral with a numerical integrator:
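```python
import numpy as np
from scipy.integrate import quad

n = 10**9
mu = n / 2
sigma = np.sqrt(n) / 2

def p(x):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# Integrate from the mean and add 1/2, rather than starting at -infinity;
# the peak of the density is so narrow relative to the infinite range that
# starting at the mean is safer numerically and equivalent by symmetry.
val, err = quad(p, mu, 500000999.5)
print(0.5 + val)  # approximately 0.5252
```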

Can you see how starting at the mean and adding \(0.5\) is equivalent to starting at \(-\infty\)?

Dice

Suppose we roll \(100\) six-sided dice; what is the probability that our sum total is at least 400?

We can solve this problem by modeling it with a normal distribution. To do so, we first compute the mean and variance associated with one roll of a die. We can then use the additivity of mean and variance to extend that to 100 rolls.

For one roll of a die, the distribution is simply \(p_1=p_2=p_3=p_4=p_5=p_6=1/6\). Thus, we can compute \(\mu\) and \(\sigma\) as follows.

\[\begin{aligned} \mu &= \sum_{k=1}^{6}\frac{k}{6} = \frac{7}{2} \\ \sigma^2 &= \sum_{k=1}^6 (k-7/2)^2/6 = \frac{35}{12} \end{aligned}\]

If we roll 100 such dice, then the outcome is approximately normal with mean \(100\mu\) and standard deviation \(10\sigma\). Thus, the density function is \[ \frac{1}{\sqrt{2\pi }\,10\sigma} e^{-(x-100\mu )^2/\left(200\sigma ^2\right)}, \] where \(\mu\) and \(\sigma\) are the values just computed. Thus, the probability that our sum is at least 400 is

\[\frac{1}{\sqrt{2\pi }10\sigma }\int_{399.5}^{\infty}e^{-(x-100\mu )^2/\left(200\sigma ^2\right)} dx \approx 0.00187522.\]
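Here’s the corresponding numerical computation:

```python
import numpy as np
from scipy.integrate import quad

mu = 100 * 7 / 2               # mean of the sum of 100 rolls: 350
sigma = 10 * np.sqrt(35 / 12)  # standard deviation of the sum

def p(x):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

val, err = quad(p, 399.5, np.inf)
print(val)  # approximately 0.00188
```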

Problems

  1. Referring to the table of standard normal integrals on the last page, compute the following.

    1. \(\displaystyle \frac{1}{\sqrt{2\pi }}\int_0^{1.3} e^{-x^2/2} \, dx\)

    2. \(\displaystyle \frac{1}{\sqrt{2\pi }}\int_{-0.4}^{1.3} e^{-x^2/2} \, dx\)

    3. \(\displaystyle \frac{1}{\sqrt{2\pi }}\int_{0.4}^{1.3} e^{-x^2/2} \, dx\)

  2. Using \(u\)-substitution, convert the following normal integrals into standard normal integrals. Then evaluate the integral using the table on the last page or your favorite numerical integrator.

    1. \(\displaystyle \frac{1}{\sqrt{2\pi }\,2}\int_0^1 e^{-(x-1)^2/8} \, dx\)

    2. \(\displaystyle \frac{1}{\sqrt{2\pi }\,4}\int_{12}^{18} e^{-(x-10)^2/32} \, dx\)

  3. Given that \[\frac{1}{\sqrt{2\pi }}\int_0^{\infty } e^{-x^2/2} \, dx=\frac{1}{2},\]

    show that

    \[\frac{1}{\sqrt{2\pi }\sigma }\int_{\mu }^{\infty } e^{-(x-\mu )^2/\left(2\sigma ^2\right)} \, dx=\frac{1}{2},\]

    for all \(\mu \in \mathbb{R}\) and \(\sigma >0\).

  4. Below we see three probability distributions. I used each of these to generate 100 points and plotted the results in figure 11. Match the distribution functions with the point plots.

    1. \(\displaystyle \frac{1}{\sqrt{2\pi }0.3}e^{-\frac{(x-1)^2}{2\cdot 0.3^2}}\) over \((-\infty ,\infty )\)

    2. \(\displaystyle \frac{1}{\sqrt{2\pi }0.7}e^{-\frac{(x-1)^2}{2\cdot 0.7^2}}\) over \((-\infty ,\infty )\)

    3. \(\displaystyle \frac{\log (5)}{24}5^{2-x}\) over \([0,2]\)

  5. For each of the following functions, find the constant \(c\) that makes the function a probability distribution over the specified interval.

    1. \(c x(x-1)\) over \([0,1]\)

    2. \(c\, 2^{-x}\) over \([0,\infty )\)

    3. \(c \sqrt{1-(x-1)^2}\) over \([0,2]\)

  6. Compute the mean \(\mu\) and standard deviation \(\sigma\) of the following distributions.

    1. The uniform distribution over \([a,b]\)

    2. The exponential distribution \(p(x)=e^{-x}\) over \([0,\infty )\)

    3. The standard normal distribution

  7. Suppose we flip a coin 1000 times. Use a normal integral to find the probability that you get more than 666 heads.

  8. Suppose we roll a standard six-sided die 12 times. Use a normal integral to find the probability that your rolls total more than 50.

  9. Suppose we roll a fair 10-sided die 10 times. Use a normal integral to find the probability that your rolls total more than 60.

Three sets of randomly generated points

A standard normal table

You can use the table below to compute values of standard normal integrals.